Here is my first analysis attempt. Since Tomlinson was never interested in anything but the critical-some condition, I will focus only on that sentence type. That breaks the problem down and makes the visual assessment and the stats wieldable (making words up as I go).
Okay, first, can we reproduce the AUC effect in the pilot data? Yes, we can. Let's plot the aggregated trajectories for Logical vs. Pragmatic training.
We can see that, in the logical condition, people move to the target in a straight line. In the pragmatic condition, they (on average) move into competitor space and only later turn towards the target. All very much in line with Tomlinson's two-step model.
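For reference, here is what AUC measures in this kind of design: the area between a trajectory and the ideal straight line from start to target, so direct movements score near zero and competitor-attracted curves score higher. A minimal sketch in plain Python/NumPy (the actual pipeline is in R; the function name and the toy trajectories are made up for illustration):

```python
import numpy as np

def auc_deviation(xs, ys):
    """Area between a trajectory and the straight start-to-target line.

    Projects each sample onto the ideal line and integrates the signed
    perpendicular deviation with the trapezoidal rule.
    """
    p = np.column_stack([xs, ys]).astype(float)
    start, target = p[0], p[-1]
    u = (target - start) / np.linalg.norm(target - start)  # unit vector along ideal line
    rel = p - start
    along = rel @ u                        # progression along the ideal line
    perp = rel @ np.array([-u[1], u[0]])   # signed perpendicular deviation
    return np.sum((perp[1:] + perp[:-1]) * np.diff(along)) / 2

# A direct trajectory has (near) zero area; a curved one does not.
t = np.linspace(0.0, 2.0, 50)
straight = auc_deviation(t, t)
curved = auc_deviation(t, t + np.sin(np.linspace(0.0, np.pi, 50)))
```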
I modelled the data as follows: scaled AUC predicted by centered Condition. Random effects… not much complexity here (yay!). It's between-subjects, so only a random intercept for subjects. Stimuli are also kinda useless atm; with more data we can estimate by-stimulus random slopes for condition, but for now we fare better with just random intercepts. That's cool, as the models are quickly estimated. (see priors below)
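For the record, "scaled AUC" and "centered Condition" amount to something like the following (a sketch with made-up toy values; the exact contrast coding used in the model isn't shown here, ±0.5 sum-coding is one common choice):

```python
import numpy as np

# Toy data standing in for the pilot set (values are made up).
auc = np.array([0.2, 0.8, 1.5, 0.1, 0.9, 1.2])
cond = np.array(["logical", "pragmatic", "pragmatic",
                 "logical", "logical", "pragmatic"])

auc_z = (auc - auc.mean()) / auc.std(ddof=1)        # scaled (z-scored) outcome
cond_c = np.where(cond == "pragmatic", 0.5, -0.5)   # centered condition contrast
```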
If we run a model on these data analogous to Tomlinson's, we get the following fit:
Looking at the posterior predictive check, there is a bump on the right that throws off the estimation quite a bit. What could that bump be? It's of course categorically different movement-trajectory types. (Mathias: we want the light blue lines to be similar to the dark blue line. If there is a strong divergence, the model doesn't accurately capture the data.)
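For intuition on why the bump ruins the fit, a crude stand-alone PPC sketch (toy numbers, not our data): a unimodal model fitted to bimodal data produces replicated draws whose tail mass cannot match the observed bump.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy outcome with a second bump, like the AUC distribution described above.
y = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(4.0, 0.3, 30)])

# "Fit" a single normal and draw 10 replicated datasets (crude PPC).
mu, sd = y.mean(), y.std(ddof=1)
y_rep = rng.normal(mu, sd, size=(10, len(y)))

# The replicates are unimodal, so they under-represent the bump's tail mass:
p_data = (y > 3).mean()                  # observed mass above 3
p_rep = (y_rep > 3).mean(axis=1).mean()  # average replicated mass above 3
```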
Okay, I estimated the optimal number of trajectory types using four commonly used k-selection methods. Cluster stability (stability) and the slope statistic (slope) are the only useful ones for our purposes. The former estimates 2 clusters, the latter 4. So let's check the data with 2, 3, and 4 clusters:
(Question: How do we preregister this step for later?)
Okay, so one trajectory type goes more or less straight to the target; the other, on average, first moves toward the competitor. But if you look at the actual trajectories (semi-transparent in the background), both clusters contain a lot of stuff. Doesn't strike me as a good categorization. Let's look at three clusters:
Certainly better, and as you can see, cluster 1 remains the same. The former cluster 2 is now split into two clusters, teasing apart dCoMs (discrete changes of mind; cluster 2) and straight lines (cluster 3). What remains unclear to me is the difference between 1 and 3. Let's try 4 clusters.
Okay, that basically teases apart the former cluster 1 into 1 and 4 (2 and 3 remain untouched). That's overall a good thing, because cluster 4 contains those two participants who started to move their mouse upwards earlier, resulting in shorter trajectories.
I think none of these cluster proposals are perfect, but they are doing something objective (avoiding subjective decisions), and they at least parcel out these dCoMs in cluster 2. Good.
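I'm not reproducing the stability or slope statistics here, but as a simpler stand-in for the k-selection idea, here is a within-cluster sum-of-squares (elbow-style) sketch with a toy k-means on made-up 1-D data. Everything in it is hypothetical illustration, not the actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(1)

def kmeans(X, k, iters=50):
    """Tiny k-means (Lloyd's algorithm); a stand-in for the clustering step."""
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == j].mean(0) if (labels == j).any()
                            else centers[j] for j in range(k)])
    return labels, centers

def within_ss(X, labels, centers):
    """Total within-cluster sum of squared deviations."""
    return sum(((X[labels == j] - centers[j]) ** 2).sum()
               for j in range(len(centers)))

# Toy 1-D "trajectory feature" with two obvious groups.
X = np.concatenate([rng.normal(0, 0.1, 30), rng.normal(5, 0.1, 30)])[:, None]
wss = [within_ss(X, *kmeans(X, k)) for k in (1, 2, 3, 4)]
# The sharp drop from k=1 to k=2 is the "elbow" a k-selection method looks for.
```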
Another thing that becomes very clear is that these dCoMs (cluster 2) are much more frequent for the pragmatic listeners, ultimately creating the asymmetry in AUC. Running a multinomial model predicting cluster by condition (using the four-cluster solution) starts to capture these differences, but more data is obviously needed. Here are the numbers:
| # | proportion | lower CI | upper CI | name | P(β > 0) | type | Condition | cluster |
|---|---|---|---|---|---|---|---|---|
| 1 | 0.87 | 0.51 | 1.00 | cluster1_logical | 1.00 | estimate | logical | cluster1 |
| 2 | 0.30 | 0.00 | 0.55 | cluster1_pragmatic | 1.00 | estimate | pragmatic | cluster1 |
| 3 | 0.04 | 0.00 | 0.11 | cluster2_logical | 1.00 | estimate | logical | cluster2 |
| 4 | 0.49 | 0.00 | 0.79 | cluster2_pragmatic | 1.00 | estimate | pragmatic | cluster2 |
| 5 | 0.06 | 0.00 | 0.29 | cluster3_logical | 1.00 | estimate | logical | cluster3 |
| 6 | 0.20 | 0.00 | 0.76 | cluster3_pragmatic | 1.00 | estimate | pragmatic | cluster3 |
| 7 | 0.03 | 0.00 | 0.10 | cluster4_logical | 1.00 | estimate | logical | cluster4 |
| 8 | 0.02 | 0.00 | 0.04 | cluster4_pragmatic | 1.00 | estimate | pragmatic | cluster4 |
| 9 | 0.57 | 0.15 | 0.97 | delta_cluster1 | 0.97 | delta | NA | cluster1 |
| 10 | -0.44 | -0.80 | -0.01 | delta_cluster2 | 0.02 | delta | NA | cluster2 |
| 11 | -0.14 | -0.90 | 0.27 | delta_cluster3 | 0.23 | delta | NA | cluster3 |
| 12 | 0.01 | -0.20 | 0.24 | delta_cluster4 | 0.57 | delta | NA | cluster4 |
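As a sanity check on the delta rows: they are just the logical-minus-pragmatic differences of the cluster proportions above. A quick sketch with the table's point estimates (the table's deltas are posterior summaries, so they need not match this arithmetic exactly):

```python
# Point estimates copied from the table above.
logical = {"cluster1": 0.87, "cluster2": 0.04, "cluster3": 0.06, "cluster4": 0.03}
pragmatic = {"cluster1": 0.30, "cluster2": 0.49, "cluster3": 0.20, "cluster4": 0.02}

# Logical minus pragmatic proportion per cluster.
delta = {c: round(logical[c] - pragmatic[c], 2) for c in logical}
# cluster1: 0.57 (matches), cluster2: -0.45 (table: -0.44, posterior summary),
# cluster3: -0.14 (matches), cluster4: 0.01 (matches)
```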
Thus we should take these clusters into account when estimating AUC values. That's what I did, and the posterior predictive check is much better (still not great, though).
If we look at the results and compare them to the first model (which ignored clusters), we can see that the effects are clearly smaller. The first model estimated the effect of training to be around 1 standard deviation (Mathias: this is a very large effect). The model with clusters estimates the within-cluster effects to be much smaller, of course, but still noticeable and all going in the same direction.
The effect magnitudes range from 0.14 to 0.49 SDs, which is still quite solid.
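Those bounds are just the smallest and largest absolute per-cluster deltas from the model output below (quick check with the point estimates):

```python
# Per-cluster AUC deltas (delta_cluster1..4) from the model output.
deltas = [-0.486, -0.299, -0.381, -0.141]
mags = sorted(abs(d) for d in deltas)
# smallest magnitude: 0.141, largest: 0.486
```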
| # | AUC (SDs) | lower CI | upper CI | name | P(β > 0) | cluster |
|---|---|---|---|---|---|---|
| 1 | -0.486 | -0.872 | -0.107 | delta_cluster1 | 0.01 | cluster1 |
| 2 | -0.299 | -0.839 | 0.246 | delta_cluster2 | 0.14 | cluster2 |
| 3 | -0.381 | -0.894 | 0.154 | delta_cluster3 | 0.07 | cluster3 |
| 4 | -0.141 | -0.865 | 0.517 | delta_cluster4 | 0.37 | cluster4 |